Results (PhD Chapter 2)


This series of files compiles all analyses carried out for Chapter 2.

All analyses were run with R 3.6.0.



Human activities considered for the analyses:


Workspace preparation

Here, we use data from subtidal ecosystems (see metadata files for more information).
Only stations sampled for both abiotic parameters and benthic species were included.
The script below includes personal functions, refined data, parameters for each campaign, and global means, standard deviations (sd) and standard errors (se).


1. Maps

i) Bathymetry

Depth

Isobaths

ii) Human activities maps

The influence of each human activity was modelled at each station from the distance between the station and the source of the activity.

Here is the distance table (in meters):

We computed a score from these distances. No weights have been applied yet; weighting is planned for a later iteration.

Distance (m)    Score   Colour
0 - 250         5       Black (#1D1D1D)
250 - 500       4       Brown (#8B3A3A)
500 - 1000      3       Red (#EE0000)
1000 - 2500     2       Orange (#FFA500)
2500 - 5000     1       Yellow (#FFFF00)
> 5000          0       Grey (#E3E3E3)
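
The scoring rule in the table can be sketched as a small lookup. This is a Python illustration only, not the R code used for the chapter. The function name `distance_score` is hypothetical, and the table does not say which class a boundary value such as 250 m falls into; the sketch assumes that each upper bound is inclusive.

```python
from bisect import bisect_left

# Upper bounds (m) of the distance classes and the matching scores, from the table.
# ASSUMPTION: upper bounds are inclusive (250 m scores 5, not 4).
UPPER_BOUNDS_M = [250, 500, 1000, 2500, 5000]
SCORES = [5, 4, 3, 2, 1, 0]


def distance_score(distance_m):
    """Return the influence score (0 to 5) for a distance to an activity source."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    # bisect_left gives the index of the first class whose upper bound is >= distance_m;
    # distances beyond the last bound fall through to the final score (0).
    return SCORES[bisect_left(UPPER_BOUNDS_M, distance_m)]
```

If the boundaries should instead be exclusive at the top of each class, replacing `bisect_left` with `bisect_right` would switch the convention.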

CityInf

InduInf

DredSit

MoorSit

RainSew

WastSew

CityWha

InduWha

Cumulative score
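
How the cumulative score is computed is not spelled out here. Since no weights have been applied yet, one plausible reading is an unweighted sum of the eight per-activity scores at each station. The Python sketch below makes that assumption explicit; the function name and the example values are hypothetical.

```python
# Activity codes as they appear in the map headings above.
ACTIVITIES = ["CityInf", "InduInf", "DredSit", "MoorSit",
              "RainSew", "WastSew", "CityWha", "InduWha"]


def cumulative_score(scores_by_activity):
    """Sum the per-activity scores (0 to 5 each) of one station.

    ASSUMPTION: the cumulative score is an unweighted sum over all activities.
    """
    missing = [a for a in ACTIVITIES if a not in scores_by_activity]
    if missing:
        raise KeyError(f"missing scores for: {missing}")
    return sum(scores_by_activity[a] for a in ACTIVITIES)


# Hypothetical station: close to a city wharf, moderately far from rain sewers.
example_station = dict.fromkeys(ACTIVITIES, 0)
example_station.update({"CityWha": 5, "RainSew": 2})
example_total = cumulative_score(example_station)
```

Under this assumption the cumulative score ranges from 0 to 40.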

2. Kriging of parameters values

This step was carried out with separate R scripts written by Aurélie Foveau and collaborators, and the results were then plotted in ArcGIS.

Some of the scripts return warnings flagging some initial values as problematic. According to Aurélie, the results can be used all the same. In ArcGIS, spherical kriging was applied to the kriged values for the graphical representation, because this method gives better smoothing. However, ArcGIS handles data with many zeros poorly, which is the case for % gravel values and E. parma abundance.

3. Species Distribution Models: GLMs for B. neotena

See Aurélie's scripts in the SDM/ folder (AUTHORIZATION REQUIRED FOR DISTRIBUTION). They have been filled in so that they work with the 2017 BSI dataset. The GLMs are used to study the realized habitat (mean of responses).

  1. Scatterplots give a graphical representation of the abundance of the species of interest along selected explanatory variables of the dataset (“fit”)
  2. Correlation matrix to support the above hypotheses
  3. Test of collinearity between variables, with selection according to VIF (deletion until all are < 2.5)
  4. Visualization of the species distribution (to check normality, which is preferable for SDM) and to see whether a transformation is needed
  5. Creation of new columns, with (i) presence/absence (PA) only and (ii) positive values + NA only
  6. Verification of the new data structures and summaries after transformation
  7. Calculation of squared variables to test polynomial relationships. In a previous version of the script, interactions between variables were investigated, but this apparently made the script less stable.
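
Steps 3 and 5 above can be sketched in Python as follows. This is an illustration, not the SDM/ scripts themselves. It assumes that the variable with the highest VIF is dropped one at a time until every remaining VIF is below 2.5, and it uses the identity that the VIFs are the diagonal of the inverse correlation matrix. All function names are hypothetical, and `None` stands in for R's NA.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix of equally long, non-constant numeric columns."""
    n = len(columns[0])
    centred = [[v - sum(c) / n for v in c] for c in columns]
    norms = [sqrt(sum(v * v for v in c)) for c in centred]
    if any(norm == 0 for norm in norms):
        raise ValueError("constant column: correlation is undefined")
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centred, norms)]
            for ci, ni in zip(centred, norms)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(data):
    """VIF of each variable in a {name: column} dict."""
    names = list(data)
    inverse = invert(correlation_matrix([data[k] for k in names]))
    return {k: inverse[i][i] for i, k in enumerate(names)}


def prune_by_vif(data, threshold=2.5):
    """ASSUMPTION: drop the highest-VIF variable until all VIFs are < threshold."""
    kept = dict(data)
    while len(kept) > 1:
        scores = vif(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] < threshold:
            break
        del kept[worst]
    return list(kept)


def split_presence_positive(abundance):
    """Step 5: (i) a 0/1 presence column and (ii) positive values with None for zeros."""
    presence = [1 if a > 0 else 0 for a in abundance]
    positive = [a if a > 0 else None for a in abundance]
    return presence, positive
```

The squared variables of step 7 would simply be extra columns holding the square of each retained variable.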

i) GLM on PA data

  1. Modelling of binomial data: null model and full model with selected variables
  2. Model selection with AIC and BIC (backward or forward), on data and squared data
  3. Validation of the selected models with the test dataset (“pred”)
  4. Use of the area under the ROC (receiver operating characteristic) curve to assess how well the values predicted by the selected models discriminate presences from absences
  5. Calculation of R2 values for each prediction, selection of the model with the highest R2
  6. Visualization on a Taylor diagram (TODO check for interpretation)
  7. Check of the model diagnostics
  8. Plot of presence probability for the species
  9. Calculation of the coefficients (needed for the GIS representation, since they are used to predict species presence with the selected model)
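
The selection and validation criteria used in steps 2 and 4 can be written out in a few lines. The Python sketch below is illustrative and independent of the R scripts; the function names are hypothetical. It uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, and computes the area under the ROC curve as the proportion of presence/absence pairs in which the presence receives the higher predicted probability.

```python
from math import log


def bernoulli_log_likelihood(observed, predicted):
    """Log-likelihood of 0/1 observations given predicted probabilities in (0, 1)."""
    return sum(log(p) if y == 1 else log(1.0 - p)
               for y, p in zip(observed, predicted))


def aic(log_lik, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return n_params * log(n_obs) - 2 * log_lik


def auc(observed, predicted):
    """Area under the ROC curve via pairwise comparison; ties count for one half."""
    presences = [p for y, p in zip(observed, predicted) if y == 1]
    absences = [p for y, p in zip(observed, predicted) if y == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in presences for b in absences)
    return wins / (len(presences) * len(absences))
```

An AUC of 0.5 corresponds to a model that ranks presences and absences no better than chance, and 1.0 to perfect separation.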

Interpretation:
From this analysis, we gather that the presence of B. neotena (0 or 1) is mainly driven by the organic matter and medium sand content.

ii) GLM on positive values

  1. Modelling of positive values: null model and full model
  2. Model selection with AIC and BIC (backward or forward), on data and squared data
  3. Validation of the selected models with the test dataset (“pred”)
  4. Use of the area under the ROC (receiver operating characteristic) curve to assess how well the values predicted by the selected models discriminate presences from absences
  5. Calculation of R2 values for each prediction, selection of the model with the highest R2
  6. Visualization on a Taylor diagram (TODO check for interpretation)
  7. Check of the model diagnostics
  8. Plot of presence probability for the species
  9. Calculation of the coefficients (needed for the GIS representation, since they are used to predict species presence with the selected model)
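
As an aid for steps 5 and 6, the statistics that a Taylor diagram summarises can be computed directly: the standard deviations of the observed and predicted values, their correlation, and the centred root-mean-square difference. The Python sketch below is illustrative; the function name is hypothetical, and it assumes that R2 is taken as the squared Pearson correlation between observed and predicted values, which may differ from the definition used in the scripts.

```python
from math import sqrt


def taylor_statistics(observed, predicted):
    """Return the summary statistics displayed on a Taylor diagram."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series of at least 2 values")
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    dev_obs = [o - mean_obs for o in observed]
    dev_pred = [p - mean_pred for p in predicted]
    sd_obs = sqrt(sum(d * d for d in dev_obs) / n)
    sd_pred = sqrt(sum(d * d for d in dev_pred) / n)
    if sd_obs == 0 or sd_pred == 0:
        raise ValueError("constant series: correlation is undefined")
    r = sum(a * b for a, b in zip(dev_obs, dev_pred)) / (n * sd_obs * sd_pred)
    # Centred RMS difference: RMS of the differences once both means are removed.
    crmsd = sqrt(sum((b - a) ** 2 for a, b in zip(dev_obs, dev_pred)) / n)
    return {
        "sd_obs": sd_obs,
        "sd_pred": sd_pred,
        "correlation": r,
        "r_squared": r * r,  # ASSUMPTION: R2 taken as squared Pearson correlation
        "centred_rmsd": crmsd,
    }
```

On the diagram, a model is closer to the observations when its correlation is high, its standard deviation is close to the observed one, and its centred RMS difference is small; the three quantities are linked by crmsd² = sd_obs² + sd_pred² - 2 · sd_obs · sd_pred · r.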

Interpretation:
From this analysis, we gather that the positive abundances of B. neotena are mainly driven by depth, fine sand and clay content, and several squared variables.

iii) Combination of the two GLMs to get the final model

Because GLMs are highly sensitive to zeros, the data were decomposed into the two models above. To obtain the final model, the predictions of the two models are multiplied. In ArcGIS, this is done with the raster calculator, using the coefficients calculated for the selected variables.
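
The multiplication can be illustrated for a single raster cell. The Python sketch below is not the ArcGIS workflow itself. It assumes a logit link for the PA model and a log link for the positive-values model; the link of the second model is not stated here, so that part is an assumption. The function names and all coefficient values are hypothetical.

```python
from math import exp


def linear_predictor(intercept, coefficients, cell):
    """Intercept plus the sum of coefficient * variable value for one cell."""
    return intercept + sum(b * cell[name] for name, b in coefficients.items())


def presence_probability(intercept, coefficients, cell):
    """Binomial GLM prediction (logit link): probability of presence."""
    return 1.0 / (1.0 + exp(-linear_predictor(intercept, coefficients, cell)))


def positive_abundance(intercept, coefficients, cell):
    """Positive-values GLM prediction.

    ASSUMPTION: log link, so the prediction is exp(linear predictor).
    """
    return exp(linear_predictor(intercept, coefficients, cell))


def combined_prediction(pa_model, positive_model, cell):
    """Final model: P(presence) multiplied by the abundance expected when present."""
    return (presence_probability(*pa_model, cell)
            * positive_abundance(*positive_model, cell))


# Hypothetical coefficients and cell values, for illustration only.
pa_model = (-1.0, {"organic_matter": 0.8, "medium_sand": 0.02})
positive_model = (0.5, {"depth": -0.01, "fine_sand": 0.03})
cell = {"organic_matter": 2.0, "medium_sand": 30.0, "depth": 20.0, "fine_sand": 40.0}
expected_abundance = combined_prediction(pa_model, positive_model, cell)
```

The same arithmetic, applied cell by cell to the rasters of the selected variables, is what the raster calculator step produces.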


Elliot Dreujou

2019-07-25